Abstract

An encoder, subject to a rate constraint, wishes to describe a Gaussian source under squared error distortion.
The decoder, besides receiving the encoder's description, also observes side information through a separate uncoded
channel subject to slow fading. The decoder knows the fading realization but the encoder knows only its distribution.
The rate-distortion function that simultaneously satisfies the distortion constraints for all fading states is given by
Heegard and Berger. A layered encoding strategy is considered in which each codeword layer targets a given fading
state. When the side-information channel has two discrete fading states, the expected distortion is minimized by
optimally allocating the encoding rate between the two codeword layers. For multiple fading states, the minimum
expected distortion is formulated as the solution of a convex optimization problem with linearly many variables and
constraints. When the fading distribution is continuous and quasiconcave, it is shown that a single-layer discrete
rate allocation is optimal. Hence for fading distributions such as Rician, Nakagami, and log-normal, it suffices for
the encoder to target only a single side-information channel condition. Furthermore, under Rayleigh fading, it is
optimal to encode with a single codeword layer as if the side information was absent.
Index Terms

Convex optimization, distortion minimization, fading channel, Heegard-Berger, rate-distortion function, side
information, source coding.
I. Introduction

In lossy data compression, side information at the decoder can help reduce the distortion in the
reconstruction of the source [1]. The decoder, however, may have access to the side information only
through an unreliable channel. For example, in distributed compression over wireless sensor networks,
correlated sensor measurements from a neighboring node may be available to the decoder through a
fading wireless channel. In this work, we consider a Gaussian source where the encoder is subject to
a rate constraint and the distortion metric is the mean squared error of the reconstruction. In addition
to the compressed symbol, we assume that the decoder observes the original symbol through a separate
analog fading channel. We assume, similar to the approach in [2], that the fading is quasistatic, and that
the decoder knows the fading realization but the encoder knows only its distribution. The rate-distortion
function that dictates the rate required to satisfy the distortion constraint associated with each fading state
is given by Heegard and Berger in [3]. We consider a layered encoding strategy based on the uncertain
fading realization in the side-information channel, and optimize the rate allocation among the possible
fading states to minimize the expected distortion. In particular, we formulate the distortion minimization
as a convex optimization problem, and develop an efficient representation for the Heegard-Berger
rate-distortion function under which the optimization problem size is linear in the number of discrete fading
states of the side-information channel. When the fading distribution of the side information is continuous
and quasiconcave, we show that a single-layer discrete rate allocation is optimal.

When the side-information channel exhibits no fading, the distortion is given by the Wyner-Ziv
rate-distortion function [4]. Rate-distortion is considered in [5], [6] when the side information is also available
at the encoder, and in [7] when there is a combination of decoder-only and encoder-and-decoder side
information. Successive refinement source coding in the presence of side information is considered in
[8], [9]. The side-information scalable rate-distortion region is characterized in [10], in which the user
with inferior side information decodes an additional layer of source-coding codeword. Lossless source
coding with an unknown amount of side information at the decoder is considered in [11], in which a
fixed data block is broadcast to different users in a variable number of channel uses [12]. In [13], [14],
expected distortion is minimized in the transmission of a Gaussian source over a slowly fading channel in
the absence of channel state information at the transmitter (CSIT). Broadcast transmission with imperfect
CSIT is considered in [15]. Another application of source coding with uncertain side information is in
systematic lossy source-channel coding [16] over a fading channel without CSIT. For example, when
upgrading legacy communication systems, a digital channel may be added to augment an existing analog
channel. In this case, the analog reception plays the role of side information in the decoding of the
description from the digital channel. In [17], [18], hybrid digital/analog and digital transmission schemes
are considered for Wyner-Ziv coding over broadcast channels. The system model studied in this paper is
also related to distributed source coding over multiple links [19], [20], where, besides source coding over
a finite-capacity reliable link, noisy versions of the source are described through additional backhaul links
that have infinite capacity but are subject to random failure. At the decoder, the realized quality of the
side information is determined by the number of backhaul links that are successfully connected. Similar
models are considered in [21], [22] for distributed unreliable relay communications.

The remainder of the paper is organized as follows. The system model is described in Section II.
Section III derives the minimum expected distortion and presents the convex optimization framework when
the side-information channel has discrete fading states. Section IV investigates the optimal rate allocation
under different fading distributions in the side-information channel. Section V considers continuous
quasiconcave fading distributions and the optimality of single-layer rate allocation. Conclusions are given
in Section VI.

II. System Model 
A. Source Coding with Fading Side-Information Channel 

Consider the system model shown in Fig. 1. An encoder wishes to describe a real Gaussian source
sequence $\{X\}$ under a rate constraint of $R_X$ nats per symbol, where the random variables in the sequence
are independent and identically distributed (i.i.d.) with $X \sim \mathcal{N}(0, \sigma_x^2)$. The decoder, in addition to receiving
the encoder's description, observes side information $Y'$, where $Y' = \sqrt{S}X + Z$, with $Z \sim$ i.i.d. $\mathcal{N}(0, 1)$.
Hence the quality of the side information depends on $S$, the power gain of the side-information channel.
We assume $S$ is a quasistatic random variable: it is drawn from the distribution $f(s)$ at the beginning
of each transmission block and remains unchanged through the block. The decoder knows the realization
of $S$, but the encoder knows only its distribution given by the probability density function (pdf) $f(s)$.
The decoder forms an estimate of the source and reconstructs the sequence $\{\hat{X}\}$. We are interested in
minimizing the expected squared error distortion $\mathrm{E}[D]$ of the reconstruction, where $D = (X - \hat{X})^2$.

Fig. 1. Source coding with fading side-information channel.

Suppose the side-information channel has $M$ discrete fading states. Let the probability distribution of
$S$ be given as follows:

$$\Pr\{S = s_i\} = p_i, \quad i = 1, \dots, M, \quad \sum_{i=1}^{M} p_i = 1 \tag{1}$$

where the $s_i$'s are enumerated in ascending order $0 \le s_1 < s_2 < \dots < s_M$. Let $Y_i'$ denote the side
information under fading state $s_i$:

$$Y_i' = \sqrt{s_i}\,X + Z, \quad i = 1, \dots, M. \tag{2}$$

Note that the set of side information random variables are stochastically degraded. Let $\hat{X}_i$ be the
reconstruction when side information $Y_i'$ is available at the decoder, and $D_i$ be the corresponding squared error
distortion. The minimum expected distortion under rate constraint $R_X$ is then given by

$$\mathrm{E}[D]^* = \min_{\mathbf{D} :\, R(\mathbf{D}) \le R_X} \mathbf{p}^T \mathbf{D} \tag{3}$$

where $\mathbf{p} \triangleq [p_1 \dots p_M]^T$, $\mathbf{D} \triangleq [D_1 \dots D_M]^T$, and $R(\mathbf{D})$ is the rate-distortion function that simultaneously
satisfies the distortion set $\mathbf{D}$.



B. Heegard-Berger Rate-Distortion Function 

The rate-distortion function that dictates the rate required to simultaneously satisfy a set of distortion
constraints associated with a set of degraded side-information random variables is given by Heegard and
Berger in [3] (an alternate form for $M = 2$ is described in [23]). When the side information random
variables satisfy the degradedness condition $X \leftrightarrow Y_M \leftrightarrow Y_{M-1} \leftrightarrow \dots \leftrightarrow Y_1$, the rate-distortion function
is

$$R_{\mathrm{HB}}(\mathbf{D}) = \min_{W_1^M \in P(\mathbf{D})} \sum_{i=1}^{M} I(X; W_i \mid Y_i, W_1^{i-1}) \tag{4}$$

where $W_1^i$ denotes the vector $W_1, \dots, W_i$. The minimization takes place over $P(\mathbf{D})$, the set of all $W_1^M$
jointly distributed with $X, Y_1^M$ such that

$$W_1^M \leftrightarrow X \leftrightarrow Y_M \leftrightarrow Y_{M-1} \leftrightarrow \dots \leftrightarrow Y_1 \tag{5}$$

and there exist decoding functions $\hat{X}_i(Y_i, W_1^i)$ under given distortion measures $d_i$'s that satisfy

$$\mathrm{E}[d_i(X, \hat{X}_i)] \le D_i, \quad i = 1, \dots, M. \tag{6}$$

Since $R_{\mathrm{HB}}(\mathbf{D})$ depends on $X, Y_1^M$ only through the marginal distributions $p(x, y_i)$,
$i = 1, \dots, M$, the degradedness of the side information need not be physical. We construct $Y_1^M$ to have
the same marginals as $Y_1'^M$ by setting $p(y_i \mid x) = p(y_i' \mid x)$, $i = 1, \dots, M$. The rate-distortion function $R(\mathbf{D})$
in (3) is then given by the Heegard-Berger rate-distortion function (4) with squared error distortion
measures $d_i(X, \hat{X}_i) = (X - \hat{X}_i)^2$.
III. Minimum Expected Distortion
A. Gaussian Source under Squared Error Distortion

The Heegard-Berger rate-distortion function $R_{\mathrm{HB}}(D_1, \dots, D_M)$ for a Gaussian source under squared
error distortion is given in [3], [9], [10]. For $M = 2$, [3] describes the Gaussian rate-distortion function
where the worse fading state corresponds to no side information, and [9] considers side information
with different quality levels. The Gaussian Heegard-Berger rate-distortion function is considered in [10]
for $M > 2$. However, the representations of $R_{\mathrm{HB}}(D_1, \dots, D_M)$ described in [3], [9] are characterized
by exponentially many distinct regions, and [10] involves optimal Gaussian random variables whose
variances are determined by an algorithmic procedure. These characterizations, though complete, are not
amenable to efficient minimization of the expected distortion. In this section, we derive a representation for
$R_{\mathrm{HB}}(D_1, \dots, D_M)$ that can be incorporated in an optimization framework. In particular, we formulate the
distortion minimization as a convex optimization problem where the number of variables and constraints
are linear in $M$. First we consider the case when the side-information channel has only two discrete fading
states ($M = 2$); in Section III-C we extend the analysis to multiple fading states where $M > 2$.

When the side-information channel has two fading states, the encoder constructs a source-coding scheme
that consists of two layers of codewords. The base layer is designed to be decodable under either channel
condition, while the top layer is only decodable under the more favorable channel realization. We derive
the rate requirements of the two codeword layers, and optimally allocate the encoding rate $R_X$ between
them to minimize the expected distortion. For $M = 2$, the Heegard-Berger rate-distortion function is
given by

$$R_{\mathrm{HB}}(D_1, D_2) = \min_{W_1, W_2 \in P(D_1, D_2)} \{ I(X; W_1 \mid Y_1) + I(X; W_2 \mid Y_2, W_1) \}. \tag{7}$$

For a Gaussian source under a squared error distortion measure, a jointly Gaussian codebook is optimal
[3], [9], [10]. When $W_1^M, X$ are jointly Gaussian, the mutual information expressions in (7) evaluate to

$$I(X; W_1 \mid Y_1) + I(X; W_2 \mid Y_2, W_1) \tag{8}$$

$$= -\frac{1}{2}\log(s_1 + \sigma_x^{-2}) - \frac{1}{2}\log\big(1 + (s_2 - s_1)\mathrm{VAR}[X \mid Y_1, W_1]\big) - \frac{1}{2}\log\big(\mathrm{VAR}[X \mid Y_2, W_1, W_2]\big) \tag{9}$$

where log is the natural logarithm, and (9) follows from expanding the conditional variance expressions
by applying Lemma 1 and Corollary 1 as given below.

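Lemma 1. Let $X, W_1^k$ be jointly Gaussian random variables. If $Y = \sqrt{s}X + Z$, where $Z \sim \mathcal{N}(0, 1)$ is
independent from $X, W_1^k$, then

$$\mathrm{VAR}[X \mid Y, W_1^k] = \big(\mathrm{VAR}[X \mid W_1^k]^{-1} + s\big)^{-1}. \tag{10}$$

Proof: The lemma follows from the minimum mean square error (MMSE) estimate of Gaussian
random variables. Let $X, \mathbf{W}$, where $\mathbf{W} \triangleq [W_1 \dots W_k]^T$, be distributed as

$$\begin{bmatrix} \mathbf{W} \\ X \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} \boldsymbol{\mu}_W \\ \mu_X \end{bmatrix}, \begin{bmatrix} \Sigma_W & \Sigma_{WX} \\ \Sigma_{WX}^T & \sigma_x^2 \end{bmatrix} \right). \tag{11}$$

The conditional distribution is Gaussian, and the corresponding variance is

\begin{align}
\mathrm{VAR}[X \mid Y, \mathbf{W}] &= \sigma_x^2 - \begin{bmatrix} \Sigma_{WX}^T & \sqrt{s}\,\sigma_x^2 \end{bmatrix} \begin{bmatrix} \Sigma_W & \sqrt{s}\,\Sigma_{WX} \\ \sqrt{s}\,\Sigma_{WX}^T & s\sigma_x^2 + 1 \end{bmatrix}^{-1} \begin{bmatrix} \Sigma_{WX} \\ \sqrt{s}\,\sigma_x^2 \end{bmatrix} \tag{12} \\
&= \frac{\sigma_x^2 - \Sigma_{WX}^T \Sigma_W^{-1} \Sigma_{WX}}{1 + s\big(\sigma_x^2 - \Sigma_{WX}^T \Sigma_W^{-1} \Sigma_{WX}\big)} \tag{13} \\
&= \big(\mathrm{VAR}[X \mid \mathbf{W}]^{-1} + s\big)^{-1}. \tag{14}
\end{align}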
Corollary 1. Let $Y_j = \sqrt{s_j}X + Z$, $Y_i = \sqrt{s_i}X + Z$. Then

$$\frac{\mathrm{VAR}[X \mid Y_i, W_1^k]}{\mathrm{VAR}[X \mid Y_j, W_1^k]} = 1 + (s_j - s_i)\,\mathrm{VAR}[X \mid Y_i, W_1^k]. \tag{15}$$
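As a quick numerical sanity check of Lemma 1 and Corollary 1, the sketch below compares the right-hand side of (10) with a conditional variance computed directly from the joint covariance. It assumes a single scalar auxiliary variable $W = \sqrt{a}X + N$ with unit-variance noise (the same form used later for the optimal codebook); the function names and the test values are illustrative and not part of the paper.

```python
def var_given_w(sigma2, a):
    """VAR[X | W] for W = sqrt(a) * X + N, N ~ N(0, 1), X ~ N(0, sigma2)."""
    return 1.0 / (1.0 / sigma2 + a)


def var_given_y_w_direct(sigma2, a, s):
    """VAR[X | Y, W] from the joint covariance of (X, W, Y), where
    W = sqrt(a) * X + N and Y = sqrt(s) * X + Z with independent unit-variance noise."""
    c_w = a ** 0.5 * sigma2            # Cov(X, W)
    c_y = s ** 0.5 * sigma2            # Cov(X, Y)
    k_ww = a * sigma2 + 1.0            # Var(W)
    k_yy = s * sigma2 + 1.0            # Var(Y)
    k_wy = (a * s) ** 0.5 * sigma2     # Cov(W, Y)
    det = k_ww * k_yy - k_wy ** 2
    # Quadratic form c^T K^{-1} c using the explicit inverse of the 2x2 matrix K.
    quad = (c_w * (k_yy * c_w - k_wy * c_y) + c_y * (k_ww * c_y - k_wy * c_w)) / det
    return sigma2 - quad


def var_given_y_w_lemma(sigma2, a, s):
    """Right-hand side of (10): (VAR[X | W]^{-1} + s)^{-1}."""
    return 1.0 / (1.0 / var_given_w(sigma2, a) + s)


# Illustrative values (assumptions, not taken from the paper).
sigma2, a, s_i, s_j = 1.0, 2.0, 3.0, 5.0
v_i = var_given_y_w_direct(sigma2, a, s_i)
v_j = var_given_y_w_direct(sigma2, a, s_j)
print(v_i, var_given_y_w_lemma(sigma2, a, s_i))   # Lemma 1: the two values should agree
print(v_i / v_j, 1.0 + (s_j - s_i) * v_i)         # Corollary 1: the two values should agree
```

The direct computation uses only the MMSE formula for jointly Gaussian variables, so agreement with (10) and (15) is a consistency check rather than a proof.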

To characterize the Heegard-Berger rate-distortion function $R_{\mathrm{HB}}(D_1, D_2)$, we substitute (9) in (7), and
minimize over $W_1, W_2$:

$$R_{\mathrm{HB}}(D_1, D_2) = -\frac{1}{2}\log(s_1 + \sigma_x^{-2}) + \min_{W_1}\Big\{ -\frac{1}{2}\log\big(1 + (s_2 - s_1)\mathrm{VAR}[X \mid Y_1, W_1]\big) + \min_{W_2}\big\{ -\frac{1}{2}\log\big(\mathrm{VAR}[X \mid Y_2, W_1, W_2]\big) \big\} \Big\}. \tag{16}$$

Note that $s_2 > s_1 \ge 0$ by assumption. Accordingly, in the inner minimization in (16), $R_{\mathrm{HB}}(D_1, D_2)$
is decreasing in $\mathrm{VAR}[X \mid Y_2, W_1, W_2]$. Hence the choice of $W_2$ is optimal when $\mathrm{VAR}[X \mid Y_2, W_1, W_2]$ is
increased until one of its upper bound constraints is tight:

$$\max_{W_2} \mathrm{VAR}[X \mid Y_2, W_1, W_2] = \min\big(\mathrm{VAR}[X \mid Y_2, W_1],\, D_2\big). \tag{17}$$

The optimal $W_2$ that achieves (17) is presented subsequently. The first term in the $\min(\cdot)$ expression
in (17) follows from the non-negativity of the mutual information $I(X; W_2 \mid Y_2, W_1)$, and the second one
follows from the distortion constraint on $\hat{X}_2$ as given in (6):

$$\mathrm{VAR}[X \mid Y_2, W_1, W_2] = \mathrm{E}\big[(X - \hat{X}_2(Y_2, W_1, W_2))^2\big] \le D_2. \tag{18}$$

Applying Corollary 1, the first term in (17) evaluates to

$$\mathrm{VAR}[X \mid Y_2, W_1] = \big(\mathrm{VAR}[X \mid Y_1, W_1]^{-1} + s_2 - s_1\big)^{-1}. \tag{19}$$

Under optimal $W_2$, therefore, the Heegard-Berger rate-distortion function in (16) reduces to

$$R_{\mathrm{HB}}(D_1, D_2) = -\frac{1}{2}\log(s_1 + \sigma_x^{-2}) + \min_{W_1}\Big\{ -\frac{1}{2}\log\big(1 + (s_2 - s_1)\mathrm{VAR}[X \mid Y_1, W_1]\big) - \frac{1}{2}\log\min\big( (\mathrm{VAR}[X \mid Y_1, W_1]^{-1} + s_2 - s_1)^{-1},\, D_2 \big) \Big\}. \tag{20}$$

The minimization over $W_1$ in (20) has a structure similar to the one previously considered in (17).
Specifically, $R_{\mathrm{HB}}(D_1, D_2)$ in (20) is decreasing in $\mathrm{VAR}[X \mid Y_1, W_1]$. Hence $W_1$ is optimal when $\mathrm{VAR}[X \mid Y_1, W_1]$
is increased until it meets one of its upper bound constraints:

$$\max_{W_1} \mathrm{VAR}[X \mid Y_1, W_1] = \min\big(\mathrm{VAR}[X \mid Y_1],\, D_1\big) \tag{21}$$

where the first term in (21) follows from the non-negativity of $I(X; W_1 \mid Y_1)$, and the second one from the
distortion constraint on $\hat{X}_1$:

$$\mathrm{VAR}[X \mid Y_1, W_1] = \mathrm{E}\big[(X - \hat{X}_1(Y_1, W_1))^2\big] \le D_1. \tag{22}$$

Next, we consider the construction of $W_1, W_2$ that achieves the rate-distortion function, namely, jointly
Gaussian random variables with conditional variances that satisfy (17), (21). We construct the optimal
distribution $W_1^*, W_2^*$ as follows:

$$W_1^* = \sqrt{a_1}\,X + N_1 \tag{23}$$
$$W_2^* = \sqrt{a_2}\,X + N_2 \tag{24}$$

where $N_i \sim$ i.i.d. $\mathcal{N}(0, 1)$, $i = 1, 2$, is independent from $X, Y_1, Y_2$, and $a_1, a_2$ are scalars whose values
are to be specified. For notational convenience, we define

$$V_1 \triangleq \mathrm{VAR}[X \mid Y_1, W_1^*] \tag{25}$$
$$= \min\big( (\sigma_x^{-2} + s_1)^{-1},\, D_1 \big) \tag{26}$$

where (26) follows from (21). Substitute (23) in (25), and $a_1$ evaluates to

$$a_1 = V_1^{-1} - \sigma_x^{-2} - s_1. \tag{27}$$

Similarly, to identify the optimal $W_2^*$, we define

$$V_2 \triangleq \mathrm{VAR}[X \mid Y_2, W_1^*, W_2^*] \tag{28}$$
$$= \min\big( (V_1^{-1} + s_2 - s_1)^{-1},\, D_2 \big) \tag{29}$$

which follows from (17)-(19). Substitute (24) in (28), and $a_2$ evaluates to

$$a_2 = V_2^{-1} - V_1^{-1} - (s_2 - s_1). \tag{30}$$

To provide an interpretation regarding the source encoding rates under different fading states of the
side-information channel, we introduce the notations

$$R_1 \triangleq I(X; W_1^* \mid Y_1) \tag{31}$$
$$= \frac{1}{2}\log\frac{(\sigma_x^{-2} + s_1)^{-1}}{V_1} \tag{32}$$
$$R_2 \triangleq I(X; W_2^* \mid Y_2, W_1^*) \tag{33}$$
$$= \frac{1}{2}\log\frac{(V_1^{-1} + s_2 - s_1)^{-1}}{V_2} \tag{34}$$

where (32), (34) follow from expanding the mutual information expressions applying (27), (30). We
interpret $R_1$ as the rate of a source coding base layer that describes $X$ when the side-information quality
is that of $Y_1$ or better. On the other hand, $R_2$ is the rate of a top layer that describes $X$ only when the
decoder has the better side information $Y_2$. Finally, we substitute (32), (34) in (7) to obtain the two-layer
Heegard-Berger rate-distortion function

$$R_{\mathrm{HB}}(D_1, D_2) = R_1 + R_2 \tag{35}$$
$$= -\frac{1}{2}\log(\sigma_x^{-2} + s_1) - \frac{1}{2}\log V_2 - \frac{1}{2}\log\big(1 + (s_2 - s_1)V_1\big) \tag{36}$$

where $V_1, V_2$ are as defined in (25), (28) above. The derivation of (36) depends on the side information only
through the marginals $p(y_i \mid x)$'s; therefore, the rate-distortion function applies as well to the stochastically
degraded side information $Y_M', \dots, Y_1'$.
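The two-layer expressions can be evaluated directly. The following is a minimal sketch, assuming the reconstructed forms of (26), (29), (32), (34), and (36) above; the function names and the numerical values are illustrative choices, not taken from the paper.

```python
import math


def two_layer_rates(d1, d2, s1, s2, sigma2=1.0):
    """Return (V1, V2, R1, R2) for distortion targets d1, d2 and gains s1 < s2."""
    v1 = min(1.0 / (1.0 / sigma2 + s1), d1)                    # (26)
    v2 = min(1.0 / (1.0 / v1 + s2 - s1), d2)                   # (29)
    r1 = 0.5 * math.log(1.0 / ((1.0 / sigma2 + s1) * v1))      # (32), base layer
    r2 = 0.5 * math.log(1.0 / ((1.0 / v1 + s2 - s1) * v2))     # (34), top layer
    return v1, v2, r1, r2


def r_hb_two_layer(d1, d2, s1, s2, sigma2=1.0):
    """Two-layer Heegard-Berger rate (36), in nats per symbol."""
    v1, v2, _, _ = two_layer_rates(d1, d2, s1, s2, sigma2)
    return (-0.5 * math.log(1.0 / sigma2 + s1)
            - 0.5 * math.log(v2)
            - 0.5 * math.log(1.0 + (s2 - s1) * v1))


# Illustrative targets: both constraints are active, so both layers carry rate.
v1, v2, r1, r2 = two_layer_rates(0.2, 0.1, s1=1.0, s2=3.0)
print(r1, r2, r1 + r2, r_hb_two_layer(0.2, 0.1, 1.0, 3.0))
```

When a distortion target is looser than what the side information alone already provides, the corresponding `min` selects the first argument and that layer's rate is zero.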

B. Optimal Distortion Trade-off and Rate Allocation 

Under a source-coding rate constraint of $R_X$, the Heegard-Berger feasible distortion region is described
by

$$\mathcal{D}(R_X) \triangleq \{ (D_1, D_2) \mid R_{\mathrm{HB}}(D_1, D_2) \le R_X \}. \tag{37}$$

The distortion regions under different values of $R_X$ are illustrated in Fig. 2. Setting $R_{\mathrm{HB}}(D_1, D_2) = R_X$,
the dominant boundary of $\{(D_1, D_2)\}$ defines the Pareto optimal trade-off curve (shown in bold in Fig. 2)
between the two distortion constraints on $\hat{X}_1$ and $\hat{X}_2$, which is given by

$$D_2 = \big[ e^{2R_X}(\sigma_x^{-2} + s_1)\big(1 + (s_2 - s_1)D_1\big) \big]^{-1} \tag{38}$$

over the interval

$$\big( e^{2R_X}(\sigma_x^{-2} + s_1) \big)^{-1} \le D_1 \le (\sigma_x^{-2} + s_1)^{-1}. \tag{39}$$

We find the optimal operating point on the Pareto curve to minimize the expected distortion

$$\mathrm{E}[D]^* = \min_{D_1, D_2 :\, R_{\mathrm{HB}}(D_1, D_2) \le R_X} p_1 D_1 + p_2 D_2. \tag{40}$$

In Section III-C, it is shown that the above minimization is a convex optimization problem. Hence the
Karush-Kuhn-Tucker (KKT) conditions are necessary and sufficient for optimality. Moreover, $\mathcal{D}(R_X)$,
being the sublevel set of a convex function, is a convex set. After substituting (38) in (40), from the KKT
optimality conditions, we obtain the optimal base layer distortion

$$D_1^* = (\tilde{D}_1)_{[D_1^-,\, D_1^+]} \tag{41}$$

where $(x)_{[a,b]}$ denotes the projection

$$(x)_{[a,b]} \triangleq \min(\max(a, x), b) \tag{42}$$

and the distortion and its boundaries are given by

$$D_1^- \triangleq \big( e^{2R_X}(\sigma_x^{-2} + s_1) \big)^{-1} \tag{43}$$
$$\tilde{D}_1 \triangleq \left[ \frac{p_2}{p_1\, e^{2R_X}(\sigma_x^{-2} + s_1)(s_2 - s_1)} \right]^{1/2} - \frac{1}{s_2 - s_1} \tag{44}$$
$$D_1^+ \triangleq (\sigma_x^{-2} + s_1)^{-1}. \tag{45}$$

The optimal top layer distortion $D_2^*$ is given by

$$D_2^* = (\tilde{D}_2)_{[D_2^-,\, D_2^+]} \tag{46}$$

where

$$D_2^- \triangleq \big( e^{2R_X}(\sigma_x^{-2} + s_2) \big)^{-1} \tag{47}$$
$$\tilde{D}_2 \triangleq \big[ e^{2R_X}(\sigma_x^{-2} + s_1)(s_2 - s_1)\, p_2 / p_1 \big]^{-1/2} \tag{48}$$
$$D_2^+ \triangleq \big( e^{2R_X}(\sigma_x^{-2} + s_1) + s_2 - s_1 \big)^{-1}. \tag{49}$$

The corresponding optimal rate allocation $R_1^*, R_2^*$ can be found as given in (32), (34).

The optimal rate allocation and the corresponding minimum expected distortion are plotted in Fig. 3
and Fig. 4, respectively, for $R_X = 1$, $\sigma_x^2 = 1$, and $s_1 = 0$ dB. Note that $R_2^*$, the rate allocated to the
top layer, is not monotonic with the side-information channel condition. As fading state $s_2$ improves, $R_2^*$
increases to take advantage of the better side-information quality. However, when $s_2$ is large, $R_2^*$ begins
to decline as the expected distortion is dominated by the worse fading state. In addition, the optimal rate
allocation is heavily skewed towards the lower layer: $R_2^* > 0$ only when $p_2$ is large.
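The closed-form operating point can be cross-checked against a brute-force search along the Pareto curve. The sketch below assumes the stationary point of $p_1 D_1 + p_2 D_2$ along (38), projected onto the base-layer interval $[(e^{2R_X}(\sigma_x^{-2}+s_1))^{-1}, (\sigma_x^{-2}+s_1)^{-1}]$; the function names, the grid size, and the parameter values are illustrative assumptions.

```python
import math


def clip(x, lo, hi):
    """Projection of x onto the interval [lo, hi], as in (42)."""
    return min(max(lo, x), hi)


def pareto_d2(d1, s1, s2, rx, sigma2=1.0):
    """Top-layer distortion on the Pareto curve (38) for a given base-layer distortion."""
    return 1.0 / (math.exp(2 * rx) * (1 / sigma2 + s1) * (1 + (s2 - s1) * d1))


def optimal_two_layer(p1, p2, s1, s2, rx, sigma2=1.0):
    """Projected stationary point of p1*D1 + p2*D2 along (38).
    Requires p1 > 0 and s2 > s1."""
    big_a = math.exp(2 * rx) * (1 / sigma2 + s1)
    c = s2 - s1
    d1_lo, d1_hi = 1 / big_a, 1 / (1 / sigma2 + s1)
    d1_tilde = math.sqrt(p2 / (p1 * big_a * c)) - 1 / c   # unconstrained stationary point
    d1 = clip(d1_tilde, d1_lo, d1_hi)
    return d1, pareto_d2(d1, s1, s2, rx, sigma2)


def grid_two_layer(p1, p2, s1, s2, rx, sigma2=1.0, n=20001):
    """Brute-force minimum of the expected distortion over a grid of D1 values."""
    lo = 1 / (math.exp(2 * rx) * (1 / sigma2 + s1))
    hi = 1 / (1 / sigma2 + s1)
    best = float("inf")
    for k in range(n):
        d1 = lo + (hi - lo) * k / (n - 1)
        best = min(best, p1 * d1 + p2 * pareto_d2(d1, s1, s2, rx, sigma2))
    return best


# Illustrative setting: s1 = 0 dB (gain 1), s2 = 10 dB (gain 10), R_X = 1 nat.
p2 = 0.95
p1 = 1 - p2
d1, d2 = optimal_two_layer(p1, p2, 1.0, 10.0, 1.0)
print(d1, d2, p1 * d1 + p2 * d2, grid_two_layer(p1, p2, 1.0, 10.0, 1.0))
```

With these values the stationary point falls inside the interval, so both layers carry rate; lowering `p2` to 0.8 pushes the solution to the lower end of the interval, where the whole rate goes to the base layer.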



C. Multiple Discrete Fading States 

The rate-distortion function (36) extends directly to the case when the side-information channel has
multiple discrete fading states: $S = s_i$ with probability $p_i$, where $i = 1, \dots, M$, with $0 \le s_1 < \dots < s_M$,
and $M > 2$. The Heegard-Berger rate-distortion function for $M > 2$ can be characterized by a similar
representation as that given in (36) for $M = 2$. Specifically, we construct the optimal distribution for the
auxiliary random variables $W_i^*$'s to be given by

$$W_i^* = \sqrt{a_i}\,X + N_i, \quad i = 1, \dots, M \tag{50}$$

where $N_i \sim$ i.i.d. $\mathcal{N}(0, 1)$, and $a_i$'s are scalars whose values are to be specified. The rate of the $i$th layer
is

$$R_i \triangleq I(X; W_i^* \mid Y_i, W_1^*, \dots, W_{i-1}^*) \tag{51}$$
$$= \frac{1}{2}\log\frac{(V_{i-1}^{-1} + s_i - s_{i-1})^{-1}}{V_i} \tag{52}$$

where

$$V_i \triangleq \mathrm{VAR}[X \mid Y_i, W_1^*, \dots, W_i^*] \tag{53}$$
$$= \min\big( (V_{i-1}^{-1} + s_i - s_{i-1})^{-1},\, D_i \big) \tag{54}$$

and $s_0 \triangleq 0$, $V_0 \triangleq \sigma_x^2$ for notational convenience. In the above, (54) follows from the non-negativity of
$I(X; W_i \mid Y_i, W_1, \dots, W_{i-1})$ and the distortion constraint (6). The $a_i$ that achieves (53) is determined from
(54), which evaluates to

$$a_i = V_i^{-1} - V_{i-1}^{-1} - (s_i - s_{i-1}). \tag{55}$$

As $R_{\mathrm{HB}}(\mathbf{D}) = \sum_{i=1}^{M} R_i$, we substitute (52) in (4) to obtain the rate-distortion function

$$R_{\mathrm{HB}}(\mathbf{D}) = -\frac{1}{2}\log(\sigma_x^{-2} + s_1) - \frac{1}{2}\log V_M - \frac{1}{2}\sum_{i=1}^{M-1}\log\big(1 + (s_{i+1} - s_i)V_i\big) \tag{56}$$

where the $V_i$'s are as given in (54).
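The recursion (54) and the per-layer rates (52) translate into a short loop. The sketch below is illustrative (function names and sample values are assumptions) and checks numerically that the layer rates sum to the telescoped expression (56).

```python
import math


def layer_rates(dist, gains, sigma2=1.0):
    """Per-layer conditional variances V_i (54) and rates R_i (52).
    dist  : distortion targets D_1..D_M
    gains : channel power gains s_1 < ... < s_M
    """
    v_prev, s_prev = sigma2, 0.0           # V_0 = sigma_x^2, s_0 = 0
    variances, rates = [], []
    for d_i, s_i in zip(dist, gains):
        bound = 1.0 / (1.0 / v_prev + s_i - s_prev)   # variance with zero rate in layer i
        v_i = min(bound, d_i)
        rates.append(0.5 * math.log(bound / v_i))
        variances.append(v_i)
        v_prev, s_prev = v_i, s_i
    return variances, rates


def r_hb(dist, gains, sigma2=1.0):
    """Heegard-Berger rate in the telescoped form (56), in nats per symbol."""
    variances, _ = layer_rates(dist, gains, sigma2)
    total = -0.5 * math.log(1.0 / sigma2 + gains[0]) - 0.5 * math.log(variances[-1])
    for i in range(len(gains) - 1):
        total -= 0.5 * math.log(1.0 + (gains[i + 1] - gains[i]) * variances[i])
    return total


# Illustrative four-state example; the first target is loose, so layer 1 gets zero rate.
gains = [0.0, 0.5, 1.0, 2.0]
dist = [1.0, 0.5, 0.3, 0.2]
variances, rates = layer_rates(dist, gains)
print(rates, sum(rates), r_hb(dist, gains))
```

Both functions run in time linear in $M$, which mirrors the linear problem size of the optimization formulation.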

Under multiple fading states, however, a closed-form expression for the minimum expected distortion
$\mathrm{E}[D]^*$ does not appear analytically tractable. Nevertheless, the expected distortion minimization in (3) can
be formulated as the following convex optimization problem:

$$\text{minimize} \quad J(D_1, \dots, D_M) \tag{57}$$
$$\text{over} \quad D_1, \dots, D_M, V_1, \dots, V_M \in \mathbb{R}_{++} \tag{58}$$
subject to
$$-\frac{1}{2}\log(\sigma_x^{-2} + s_1) - \frac{1}{2}\log V_M - \frac{1}{2}\sum_{i=1}^{M-1}\log\big(1 + (s_{i+1} - s_i)V_i\big) \le R_X \tag{59}$$
$$V_i \le (V_{i-1}^{-1} + s_i - s_{i-1})^{-1}, \quad i = 1, \dots, M \tag{60}$$
$$V_i \le D_i, \quad i = 1, \dots, M \tag{61}$$

where $\mathbb{R}_{++}$ denotes the set of positive real numbers. In (57) above, the cost function $J(\cdot)$ may be any
function that is convex in $D_1, \dots, D_M$. The constraint (59) prescribes the feasible Heegard-Berger
distortion region under the source-coding rate constraint $R_X$. The constraints (60) and (61) derive from
writing out the two upper bounds for each $V_i$, as described in (54), as two separate inequality constraints.
The equality in (54) may be written as inequality constraints since there is an optimal solution where for
each $i$ at least one of (60) or (61) is tight. Specifically, the left-hand side of the Heegard-Berger constraint
in (59) is monotonically decreasing in the $V_i$'s. Hence for a given optimal $\{V_i^*, D_i^*\}$, if neither (60) nor (61)
is tight, $V_i^*$ may be increased to strictly enlarge the feasible set of $\{D_1, \dots, D_M, V_1, \dots, V_M\} \setminus \{D_i, V_i\}$.



Proposition 1. The minimization given in (57)-(61) is a convex optimization problem.

Proof: Each of the inequality constraints in (59)-(61) is convex: i.e., it is of the form

$$C_x(D_1, \dots, D_M, V_1, \dots, V_M) \le C_c(D_1, \dots, D_M, V_1, \dots, V_M) \tag{62}$$

where $C_x(\cdot)$ is convex in $D_1, \dots, V_M$, and $C_c(\cdot)$ is concave in $D_1, \dots, V_M$. In particular, in (60), the
right-hand side of each inequality constraint depends only on $V_{i-1}$. Being twice-differentiable, its concavity
can be verified by the second-order condition

$$\frac{\partial^2}{\partial V_{i-1}^2}\,(V_{i-1}^{-1} + s_i - s_{i-1})^{-1} = -\frac{2(s_i - s_{i-1})}{\big(1 + (s_i - s_{i-1})V_{i-1}\big)^3} \tag{63}$$

which is negative since $s_i > s_{i-1}$, $V_{i-1} > 0$ as given in the problem formulation, for $i = 2, \dots, M$.
Therefore, in (57)-(61), we minimize a convex function subject to a set of convex inequality constraints,
which is a convex optimization problem. ■

Convexity implies that a local optimum is globally optimal, and its solution can be efficiently computed
by standard convex optimization numerical techniques, for instance, by the interior-point method [24],
[25]. Moreover, the optimization problem (57)-(61) has $2M$ variables and $2M + 1$ inequality constraints,
which are linear in the number of side-information channel fading states $M$.

In the case where the cost function $J(D_1, \dots, D_M)$ is non-decreasing in each component $D_i$, the
constraints (61) may be taken as tight: if at their optimal values $V_i^* < D_i^*$, then $D_i^*$ may be decreased
without violating feasibility or increasing the cost function. In particular, we consider minimizing the
expected distortion: $J(D_1, \dots, D_M) = \mathrm{E}[D] = \mathbf{p}^T\mathbf{D}$, in which case the optimization problem can be
specified more compactly as

$$\text{minimize} \quad p_1 D_1 + \dots + p_M D_M \tag{64}$$
$$\text{over} \quad D_1, \dots, D_M \in \mathbb{R}_{++} \tag{65}$$
subject to
$$-\frac{1}{2}\log(\sigma_x^{-2} + s_1) - \frac{1}{2}\log D_M - \frac{1}{2}\sum_{i=1}^{M-1}\log\big(1 + (s_{i+1} - s_i)D_i\big) \le R_X \tag{66}$$
$$D_i \le (D_{i-1}^{-1} + s_i - s_{i-1})^{-1}, \quad i = 1, \dots, M \tag{67}$$

where in (67) similarly $D_0 \triangleq \sigma_x^2$. In the following, we characterize the KKT optimality conditions for
the expected distortion minimization problem. First, we form the Lagrangian

$$L(\mathbf{D}, \boldsymbol{\lambda}, \mu) = \sum_{i=1}^{M} p_i D_i + \sum_{i=1}^{M} \lambda_i\big(D_i - (D_{i-1}^{-1} + s_i - s_{i-1})^{-1}\big) + \mu\Big( -\frac{1}{2}\log(\sigma_x^{-2} + s_1) - \frac{1}{2}\log D_M - \frac{1}{2}\sum_{i=1}^{M-1}\log\big(1 + (s_{i+1} - s_i)D_i\big) - R_X \Big) \tag{68}$$

where $\boldsymbol{\lambda} \triangleq [\lambda_1 \dots \lambda_M]^T$, and $\mu$, $\boldsymbol{\lambda}$ are the Lagrange multipliers, or dual variables, associated with
inequalities (66), (67), respectively. At optimality, the gradient of the Lagrangian vanishes:

$$0 = \frac{\partial L}{\partial D_i} = p_i + \lambda_i - \frac{\lambda_{i+1}}{\big(1 + (s_{i+1} - s_i)D_i\big)^2} - \frac{\mu}{2}\,\frac{s_{i+1} - s_i}{1 + (s_{i+1} - s_i)D_i}, \quad i = 1, \dots, M - 1 \tag{69}$$
$$0 = \frac{\partial L}{\partial D_M} = p_M + \lambda_M - \frac{\mu}{2 D_M} \tag{70}$$

and the complementary slackness conditions hold:

$$0 = \lambda_i\big(D_i - (D_{i-1}^{-1} + s_i - s_{i-1})^{-1}\big), \quad i = 1, \dots, M \tag{71}$$
$$0 = \mu\Big( -\frac{1}{2}\log(\sigma_x^{-2} + s_1) - \frac{1}{2}\log D_M - \frac{1}{2}\sum_{i=1}^{M-1}\log\big(1 + (s_{i+1} - s_i)D_i\big) - R_X \Big). \tag{72}$$

Note that (69)-(72) represent $2M + 1$ equations in $2M + 1$ variables. The primal feasibility conditions
are given by (66), (67), and the dual feasibility conditions are

$$\mu \ge 0, \quad \lambda_i \ge 0, \quad i = 1, \dots, M. \tag{73}$$

Together, (69)-(72), (66), (67), and (73) are the necessary and sufficient conditions for optimality in the
convex problem (64)-(67).



IV. Rate Allocation Under Different Fading Distributions 

In this section, we apply the optimization framework developed in Section III-C and study the optimal
rate allocation when the side-information channel is subject to different fading distributions. We first
consider the scenario when the side-information channel experiences Rician fading, the pdf of which is
given by

$$f_K(s) = \frac{(1 + K)e^{-K}}{\bar{S}} \exp\left( -\frac{(1 + K)s}{\bar{S}} \right) I_0\left( 2\sqrt{\frac{K(1 + K)s}{\bar{S}}} \right), \quad s \ge 0 \tag{74}$$

where $I_0(\cdot)$ is the modified Bessel function of zeroth order, and $\bar{S}$ is the mean channel power gain.
The Rician $K$-factor represents the power ratio of the line-of-sight (LOS) component to the non-LOS
components. Specifically, (74) reduces to Rayleigh fading for $K = 0$, and to no fading (i.e., constant
channel power gain of $\bar{S}$) for $K = \infty$. We discretize the channel fading pdf into $M$ states

$$p_i = \Pr\{\text{Side information channel state } s_i \text{ is realized}\} \tag{75}$$
$$= \int_{s_i}^{s_{i+1}} f(s)\,ds, \quad i = 1, \dots, M \tag{76}$$

where we truncate the pdf at $s_M$. The quantized channel power gains are evenly spaced:
$s_i = (i - 1)s_M/(M - 1)$, $i = 1, \dots, M$, and $s_{M+1} \triangleq \infty$. In the numerical experiments, the convex optimization
problems are solved using the primal-dual interior-point algorithm described in [25, Section 11.7]. The
optimal rate allocation that minimizes the expected distortion $\mathrm{E}[D]$ is shown in Fig. 5 and Fig. 6,
respectively, for different values of $K$ and $R_X$ with $M = 150$. For comparison, we also show in the
figures the optimal rate allocation under Nakagami fading with the pdf

$$f_m(s) = \frac{(m/\bar{S})^m s^{m-1} e^{-ms/\bar{S}}}{\Gamma(m)}, \quad s \ge 0 \tag{77}$$

where $\Gamma(\cdot)$ is the gamma function. In Fig. 5 and Fig. 6, the Nakagami parameter $m$ is set to be
$m = (K + 1)^2/(2K + 1)$, under which the Nakagami distribution (77) is commonly used to approximate the
Rician distribution in (74) [26].
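The discretization (75)-(76) can be sketched as follows. This is an illustrative implementation under several assumptions that are not from the paper: the standard Rician and Nakagami power-gain pdfs are used, $I_0$ is evaluated by its power series, each bin is integrated with composite Simpson's rule, and the mass beyond $s_M$ is assigned to the last state. All function names are hypothetical.

```python
import math


def bessel_i0(x, max_terms=400):
    """Modified Bessel function I_0(x) via its power series sum_k (x^2/4)^k / (k!)^2."""
    q = x * x / 4.0
    total, term = 1.0, 1.0
    for k in range(1, max_terms):
        term *= q / (k * k)
        total += term
        if term < 1e-17 * total:
            break
    return total


def rician_pdf(s, k_factor, s_bar=1.0):
    """Rician power-gain pdf, K-factor k_factor and mean gain s_bar."""
    scale = (1.0 + k_factor) / s_bar
    return (scale * math.exp(-k_factor - scale * s)
            * bessel_i0(2.0 * math.sqrt(k_factor * scale * s)))


def nakagami_m(k_factor):
    """Nakagami parameter matched to a Rician K-factor: m = (K + 1)^2 / (2K + 1)."""
    return (k_factor + 1.0) ** 2 / (2.0 * k_factor + 1.0)


def nakagami_pdf(s, m, s_bar=1.0):
    """Nakagami power-gain pdf (requires m >= 1 for evaluation at s = 0)."""
    return (m / s_bar) ** m * s ** (m - 1.0) * math.exp(-m * s / s_bar) / math.gamma(m)


def discretize(pdf, s_max, num_states, sub=200):
    """Evenly spaced states on [0, s_max] with p_i = integral of pdf over [s_i, s_{i+1}].
    `sub` (even) is the number of Simpson sub-intervals per bin."""
    step = s_max / (num_states - 1)
    states = [i * step for i in range(num_states)]
    probs = []
    h = step / sub
    for i in range(num_states - 1):
        lo = states[i]
        acc = pdf(lo) + pdf(lo + step)
        for j in range(1, sub):
            acc += (4.0 if j % 2 else 2.0) * pdf(lo + j * h)
        probs.append(acc * h / 3.0)
    probs.append(max(0.0, 1.0 - sum(probs)))   # tail mass beyond s_max goes to s_M
    return states, probs


states, probs = discretize(lambda s: rician_pdf(s, 0.0), 2.0, 11)   # K = 0: Rayleigh
print(probs[0], 1.0 - math.exp(-states[1]))   # should agree for Rayleigh fading
```

For $K = 0$ the bin probabilities have the closed form $e^{-s_i} - e^{-s_{i+1}}$ (with $\bar{S} = 1$), which gives a convenient check of the numerical integration.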

In each case of the numerical results, it is observed that the optimal rate allocation is concentrated at
a single layer, i.e., $R_i^* = R_X$ for some $i = i^*$ at $s_{i^*}$, while $R_i^* = 0$ for all other $i \ne i^*$. For instance,
the optimal primal and dual variables $D_i^*, \lambda_i^*$ are plotted in Fig. 7 for the case of Rician fading with
$K = 32$, $\bar{S} = 1$, $R_X = 0.25$, $\sigma_x^2 = 1$. In this case, the rate allocation concentrates at $s_{i^*} \approx 0.55$, and the
complementary slackness condition (71) stipulates that the corresponding dual variable be zero: $\lambda_{i^*} = 0$.
In Fig. 5, under Rayleigh fading ($K = 0$), the optimal rate allocation concentrates at the base layer
(i.e., $s_{i^*} = 0$) of the source code. In the case where the side-information channel has a prominent LOS
component, i.e., when $K$ is large, $s_{i^*}$ increases accordingly as the channel distribution is more concentrated
around $\bar{S}$. On the other hand, a large source-coding rate $R_X$ decreases $s_{i^*}$, which implies that it is less
beneficial to be opportunistic to target possible good channel conditions when $R_X$ is large. Moreover, for
each $\bar{S}$, Nakagami fading results in a higher $s_{i^*}$ than its corresponding Rician fading distribution.
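Because the observed optimum places the whole rate in one layer, a brute-force search over the target layer is a useful cross-check on a solver. The sketch below is not the full convex program (64)-(67): it only compares single-layer allocations, which the numerical results above report as optimal. The Rayleigh discretization uses the closed-form bin probabilities with $\bar{S} = 1$; function names and parameters are illustrative assumptions.

```python
import math


def rayleigh_states(num_states, s_max):
    """Evenly spaced states with p_i = exp(-s_i) - exp(-s_{i+1}); tail mass at s_M."""
    step = s_max / (num_states - 1)
    states = [i * step for i in range(num_states)]
    probs = [math.exp(-states[i]) - math.exp(-states[i + 1])
             for i in range(num_states - 1)]
    probs.append(math.exp(-states[-1]))
    return states, probs


def single_layer_distortion(states, probs, target, rx, sigma2=1.0):
    """Expected distortion when the entire rate rx is spent on layer `target`.
    States below the target use only the side information; states at or above
    it also decode the single codeword layer."""
    inv_var = 1.0 / sigma2
    v_target = math.exp(-2.0 * rx) / (inv_var + states[target])
    total = 0.0
    for i, (s, p) in enumerate(zip(states, probs)):
        if i < target:
            total += p / (inv_var + s)
        else:
            total += p / (1.0 / v_target + s - states[target])
    return total


def best_single_layer(states, probs, rx, sigma2=1.0):
    """Index of the target layer with the smallest expected distortion."""
    costs = [single_layer_distortion(states, probs, j, rx, sigma2)
             for j in range(len(states))]
    return min(range(len(states)), key=costs.__getitem__), costs


states, probs = rayleigh_states(150, 2.0)
best, costs = best_single_layer(states, probs, rx=1.0)
print(best, states[best], costs[best])
```

Under Rayleigh fading the search selects the base layer at $s = 0$, consistent with the behavior described for $K = 0$; a distribution concentrated at a high gain instead moves the best target upward.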

Fig. 5. Optimal rate allocation that minimizes the expected distortion $\mathrm{E}[D]$. The rate allocation corresponding to Rician fading is shown
in bars, and the one corresponding to Nakagami fading with $m = (K + 1)^2/(2K + 1)$ is shown in lines. In each case, the optimal rate
allocation is concentrated at a single layer. ($R_X = 1$, $\bar{S} = 1$, $s_M = 2\bar{S}$, $\sigma_x^2 = 1$, $M = 150$.)

Fig. 6. Optimal rate allocation that minimizes the expected distortion $\mathrm{E}[D]$ with $K = 16$ under different values of $R_X$ (the other parameters
are the same as those in Fig. 5). In each case, the optimal rate allocation is concentrated at a single layer.

Fig. 7. Optimal primal and dual variables in expected distortion minimization under Rician fading with $K = 32$, $\bar{S} = 1$, $R_X = 0.25$.

Fig. 8. Minimum expected distortion. The dash-dot line corresponds to the rate-distortion function with no side information (No SI). The
dashed line ($K = \infty$, i.e., the side-information channel has no fading) corresponds to the Wyner-Ziv (WZ) rate-distortion function. ($\bar{S} = 10$,
$s_M = 2\bar{S}$, $\sigma_x^2 = 1$, $M = 150$.)

The minimum expected distortion $\mathrm{E}[D]^*$ that corresponds to the optimal rate allocation is shown in Fig. 8.
For comparison, along with $\mathrm{E}[D]^*$, in Fig. 8 we also show the distortion under different assumptions on
the side information. When no side information is available, the distortion is given by the rate-distortion
function for a Gaussian source [27]

$$D_{\text{No-SI}}(R_X) = \sigma_x^2 e^{-2R_X}. \tag{78}$$

$D_{\text{No-SI}}$, the distortion in the absence of side information, is an upper bound to $\mathrm{E}[D]^*$. On the other hand, when $K = \infty$,
there is no uncertainty in the side-information channel condition with $S = \bar{S}$, and the distortion is given
by the Wyner-Ziv [4] rate-distortion function

$$D_{\text{WZ}}(R_X) = (\sigma_x^{-2} + \bar{S})^{-1} e^{-2R_X}. \tag{79}$$

In Fig. 8, a larger $K$ decreases the expected distortion $\mathrm{E}[D]^*$, and Nakagami fading has a lower $\mathrm{E}[D]^*$
than the corresponding Rician fading distribution. In addition, when $R_X$ is small, $\mathrm{E}[D]^*$ is considerably
lower than $D_{\text{No-SI}}$ where no side information is available, as the reduction in $\mathrm{VAR}[X]$ from the side
information at the decoder is significant. However, when $R_X$ is large, the improvement of $\mathrm{E}[D]^*$ over
$D_{\text{No-SI}}$ diminishes, as most of the reduction in $\mathrm{VAR}[X]$ is due to the source-coding rate of $R_X$.

In the numerical experiments considered above, it appears the optimal rate allocation concentrates at a
single codeword layer under a wide class of fading distributions. In Section V, we consider the case where
the side-information channel fading distribution is continuous and quasiconcave. We show
that a continuous rate allocation $R(s)$ over a continuum of codeword layers is indeed not necessary, and
a single-layer discrete rate allocation is optimal.

In the following, we make a remark on the distortion exponent $\Delta$, defined similarly as given in [28],
which characterizes the rate of exponential decay in distortion at asymptotically large encoding rates:

$$\Delta \triangleq -\lim_{R_X \to \infty} \frac{\log \mathrm{E}[D(R_X)]^*}{2R_X} \tag{80}$$

where $R_X$ is the source-coding rate, and $\mathrm{E}[D(R_X)]^*$ is the corresponding minimum expected distortion
under $R_X$. We note that the distortion exponent $\Delta$ does not depend on the fading distribution $f(s)$, since

$$D_{\text{WZ}}(R_X) \le \mathrm{E}[D(R_X)]^* \le D_{\text{No-SI}}(R_X) \tag{81}$$
$$\lim_{R_X \to \infty} \frac{\log D_{\text{WZ}}(R_X)}{2R_X} = \lim_{R_X \to \infty} \frac{\log D_{\text{No-SI}}(R_X)}{2R_X} = -1. \tag{82}$$

Therefore, reducing the side-information channel uncertainty (e.g., via deploying multiple antennas or
through channel state information feedback) may reduce the expected distortion $\mathrm{E}[D(R_X)]^*$ at finite $R_X$,
but it does not improve performance in the asymptotic regime in terms of the rate of exponential decay
as a function of the encoding rate $R_X$.
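The sandwich argument can be illustrated numerically. The sketch below evaluates the two bounding distortions, the no-side-information function (78) and the Wyner-Ziv function (79), and the finite-rate estimate $-\log D / (2R_X)$ of the decay exponent; the function names and the chosen rates are illustrative assumptions.

```python
import math


def d_no_si(rx, sigma2=1.0):
    """Gaussian rate-distortion function without side information, (78)."""
    return sigma2 * math.exp(-2.0 * rx)


def d_wz(rx, s_bar, sigma2=1.0):
    """Wyner-Ziv distortion with a fixed side-information gain s_bar, (79)."""
    return math.exp(-2.0 * rx) / (1.0 / sigma2 + s_bar)


def exponent_estimate(distortion, rx):
    """Finite-rate estimate -log(D) / (2 * rx) of the exponential decay rate."""
    return -math.log(distortion) / (2.0 * rx)


for rx in (1.0, 5.0, 50.0):
    lower = exponent_estimate(d_no_si(rx), rx)        # from the upper bound on distortion
    upper = exponent_estimate(d_wz(rx, 10.0), rx)     # from the lower bound on distortion
    print(rx, lower, upper)
```

As $R_X$ grows, the two estimates approach the same value, so any expected distortion lying between the two bounds has the same decay exponent.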

V. Continuous Fading Distribution 

A. Infinite-Dimensional Expected Distortion Minimization 

In this section, we investigate the optimal rate allocation and minimum expected distortion when the
fading distribution of the side-information channel is continuous. We assume the set of fading states is
given by $\{0, \Delta s, 2\Delta s, \dots\}$, and consider the limiting process in which $\Delta s \to 0$. Let $f(s)$ be the pdf
of the fading side-information channel; then the expected distortion is

$$\mathrm{E}[D] = \int_0^\infty f(s)D(s)\,ds \tag{83}$$

where the distortion function $D(s)$ represents the realized distortion should the side-information channel
take on the fading state $S = s$, $s \ge 0$. Note that $D(0^-) = \sigma_X^2$ and $D(s) \ge 0$. Since distortion decreases with
the side-information channel strength, $D(s)$ is decreasing. Consequently, $D(s)$ is a function of bounded
variation, and $D(s)$ is differentiable almost everywhere. Over the region where $D(s)$ is differentiable, the
difference equation in (52) converges to

$$R(s) = \lim_{\Delta s\to 0} \frac{1}{2\Delta s}\log\frac{\big((D(s) - D'(s)\Delta s)^{-1} + \Delta s\big)^{-1}}{D(s)} = -\frac{1}{2}\Big(\frac{D'(s)}{D(s)} + D(s)\Big)$$

and discontinuities in $D(s)$ are represented by corresponding Dirac deltas in $R(s)$. Suppose $D(s)$ is
nondifferentiable at $\{s_1, \dots, s_n\}$. The total source-coding rate is given by

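$$\int_0^\infty R(s)\,ds = -\frac{1}{2}\sum_{i=0}^{n}\int_{s_i}^{s_{i+1}}\Big(\frac{D'(s)}{D(s)} + D(s)\Big)ds + \frac{1}{2}\sum_{i=1}^{n}\log\frac{D(s_i^-)}{D(s_i^+)} \tag{86}$$

where $s_0 = 0$, $s_{n+1} = \infty$, and the last term in (86) follows from (52). We note that

$$\int_a^b \frac{D'(s)}{D(s)}\,ds = \log\frac{D(b)}{D(a)} \tag{87}$$

and (86) thus simplifies to

$$\int_0^\infty R(s)\,ds = -\frac{1}{2}\log\sigma_X^{-2} - \frac{1}{2}\lim_{z\to\infty}\Big(\log D(z) + \int_0^z D(s)\,ds\Big). \tag{88}$$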
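The passage from the layered difference equation to the rate density and to the simplified total rate (88) can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a hypothetical smooth test profile $D(s) = 1/g(s)$ with $g(s) = \sigma_X^{-2} + s + K(1 - e^{-s})$, chosen so that $D'(s) \le -D^2(s)$, and it takes the per-layer rate to be $\frac{1}{2}\log[(D_{\text{prev}}^{-1} + \Delta s)^{-1}/D_{\text{cur}}]$, which is the form implied by the limit above. The names `K`, `g`, `layered_rate`, and so on are not from the paper.

```python
import math

SIGMA2 = 1.0  # source variance sigma_X^2 (illustrative value)
K = 2.0       # hypothetical shape parameter of the test profile

def g(s):
    """1/D(s); g'(s) = 1 + K exp(-s) >= 1, so D'(s) <= -D(s)^2 holds."""
    return 1.0 / SIGMA2 + s + K * (1.0 - math.exp(-s))

def D(s):
    return 1.0 / g(s)

def rate_density(s):
    """R(s) = -(1/2)(D'(s)/D(s) + D(s)), which equals (g'(s) - 1) / (2 g(s))."""
    return K * math.exp(-s) / (2.0 * g(s))

def layered_rate(s_max, ds):
    """Sum of per-layer rates (1/2) log[(1/D_prev + ds)^(-1) / D_cur] on a grid."""
    n = int(round(s_max / ds))
    total = 0.0
    for i in range(1, n + 1):
        d_prev, d_cur = D((i - 1) * ds), D(i * ds)
        total += 0.5 * math.log(1.0 / ((1.0 / d_prev + ds) * d_cur))
    return total

def integrated_density(s_max, ds):
    """Midpoint-rule integral of the rate density over [0, s_max]."""
    n = int(round(s_max / ds))
    return ds * sum(rate_density((i + 0.5) * ds) for i in range(n))

def closed_form_rate(s_max, ds):
    """Finite-z version of the simplified total-rate expression:
    -(1/2) log(1/sigma2) - (1/2)(log D(z) + integral_0^z D(s) ds)."""
    n = int(round(s_max / ds))
    integral_d = ds * sum(D((i + 0.5) * ds) for i in range(n))
    return 0.5 * math.log(SIGMA2) - 0.5 * (math.log(D(s_max)) + integral_d)

if __name__ == "__main__":
    print(layered_rate(20.0, 1e-3))
    print(integrated_density(20.0, 1e-3))
    print(closed_form_rate(20.0, 1e-3))
```

The three numbers agree up to discretization error, and the gap between the layered sum and the integral shrinks as the grid spacing is reduced.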

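The infinite-dimensional optimization problem is then stated as follows:

$$\text{minimize} \quad \int_0^\infty f(s)D(s)\,ds \tag{89}$$

$$\text{over} \quad D(s) \ge 0, \quad D(0^-) = \sigma_X^2 \tag{90}$$

subject to

$$-\frac{1}{2}\log\sigma_X^{-2} - \frac{1}{2}\lim_{z\to\infty}\Big(\log D(z) + \int_0^z D(s)\,ds\Big) \le R_X \tag{91}$$

$$D'(s) \le -D^2(s) \tag{92}$$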



where (92) is the infinite-dimensional counterpart of (67), since

$$\lim_{\Delta s\to 0} \frac{D(s) - \big((D(s) - D'(s)\Delta s)^{-1} + \Delta s\big)^{-1}}{\Delta s} = D'(s) + D^2(s). \tag{93}$$

Proposition 2. The infinite-dimensional expected distortion minimization problem (89)-(92) is a convex
optimization problem.

Proof: In (89), the functional is linear in $D(s)$. In (91), $-\log D(z)$ is convex, and the integral is
linear in $D(z)$. In (92), $D'(s)$ is linear, and $-D^2(s)$ is concave in $D(s)$. ■
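To make the convexity of constraint (92) concrete, the following sketch mixes two feasible distortion profiles and checks that the mixture still satisfies $D'(s) + D^2(s) \le 0$ on a grid. The two profiles (no encoding rate, and all rate placed at $s = 0$) and the values of `SIGMA2` and `RATE` are illustrative assumptions, not quantities taken from the paper.

```python
import math

SIGMA2 = 1.0  # source variance (illustrative value)
RATE = 0.5    # encoding rate in nats (illustrative value)

def profile_no_rate(s):
    """D(s) = (1/sigma2 + s)^(-1) with no encoding rate; returns (D, D')."""
    g = 1.0 / SIGMA2 + s
    return 1.0 / g, -1.0 / g ** 2

def profile_base_layer(s):
    """D(s) = (s + exp(2R)/sigma2)^(-1), all rate at s = 0; returns (D, D')."""
    g = s + math.exp(2.0 * RATE) / SIGMA2
    return 1.0 / g, -1.0 / g ** 2

def constraint_slack(theta, s):
    """D'(s) + D(s)^2 for the mixture theta*D_a + (1 - theta)*D_b.
    The mixture is feasible at s when this value is <= 0."""
    da, dda = profile_no_rate(s)
    db, ddb = profile_base_layer(s)
    d = theta * da + (1.0 - theta) * db
    dd = theta * dda + (1.0 - theta) * ddb
    return dd + d * d

if __name__ == "__main__":
    worst = max(constraint_slack(t / 10.0, 0.1 * k)
                for t in range(11) for k in range(100))
    print("largest slack over the grid:", worst)
```

Each endpoint profile meets the constraint with equality, while every strict mixture has negative slack, as expected when the feasible set is convex.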
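In the following, we characterize the KKT optimality conditions. First, we form the infinite-dimensional
Lagrangian [29]

$$L(D(s), \lambda(s), \mu) = \int_0^\infty f(s)D(s)\,ds + \int_0^\infty \lambda(s)\big(D'(s) + D^2(s)\big)\,ds + \mu\Big(-\frac{1}{2}\log\sigma_X^{-2} - \frac{1}{2}\lim_{z\to\infty}\Big(\log D(z) + \int_0^z D(s)\,ds\Big) - R_X\Big). \tag{94}$$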

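Suppose $D(s)$ is continuously differentiable over the region $\mathcal{S}$. At optimality, the functional derivative
of the Lagrangian vanishes, which corresponds to the following two sets of conditions [30]. First, the
Euler-Lagrange equation holds

$$\frac{\partial\mathcal{L}}{\partial D} - \frac{d}{ds}\Big(\frac{\partial\mathcal{L}}{\partial D'}\Big) = 0, \quad s \in \mathcal{S} \tag{95}$$

where

$$\mathcal{L}(s, D, D') \triangleq f(s)D(s) + \lambda(s)\big(D'(s) + D^2(s)\big) - \frac{\mu}{2}D(s). \tag{96}$$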

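Second, the following boundary conditions are satisfied

$$H(s^-) = H(s^+), \quad s \in \mathcal{S}^c \tag{97}$$

$$H(\infty) = 0 \tag{98}$$

where $H(s)$ is the Hamiltonian defined by

$$H(s) \triangleq D'(s)\frac{\partial\mathcal{L}}{\partial D'} - \mathcal{L}(s, D, D'). \tag{99}$$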
Complementary slackness in the KKT conditions is characterized by

$$0 = \lambda(s)\big(D'(s) + D^2(s)\big) \tag{100}$$

$$0 = \mu\Big(-\frac{1}{2}\log\sigma_X^{-2} - \frac{1}{2}\lim_{z\to\infty}\Big(\log D(z) + \int_0^z D(s)\,ds\Big) - R_X\Big). \tag{101}$$

The primal feasibility conditions are as described in (91), (92), and the dual feasibility conditions are as
follows:

$$\mu \ge 0, \quad \lambda(s) \ge 0, \quad s \ge 0. \tag{102}$$

Together, (91), (92), (95)-(102) are the necessary and sufficient conditions characterizing the solution of
the convex optimization problem (89)-(92). For reference, we note that the Euler-Lagrange equation (95)
evaluates to

$$\lambda'(s) - 2D(s)\lambda(s) = f(s) - \mu/2, \quad s \in \mathcal{S} \tag{103}$$

which is a first-order linear differential equation in $\lambda(s)$. The Hamiltonian (99) evaluates to

$$H(s) = \big(\mu/2 - f(s)\big)D(s) - \lambda(s)D^2(s) \tag{104}$$

under which the boundary condition (98) implies

$$\mu/2 = \lim_{s\to\infty} \lambda(s)D(s) + f(s). \tag{105}$$
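For readers who want the intermediate step, the evaluations (103) and (104) follow by direct differentiation of the Lagrangian integrand. The short derivation below is a sketch that assumes the integrand is $\mathcal{L} = f(s)D(s) + \lambda(s)(D'(s) + D^2(s)) - (\mu/2)D(s)$, i.e., that the rate constraint contributes the term $-(\mu/2)D(s)$ inside the integral while the $\log D(z)$ term enters only through the boundary.

```latex
\begin{align*}
\frac{\partial\mathcal{L}}{\partial D}
  &= f(s) + 2\lambda(s)D(s) - \frac{\mu}{2},
\qquad
\frac{\partial\mathcal{L}}{\partial D'} = \lambda(s), \\[4pt]
0 &= \frac{\partial\mathcal{L}}{\partial D}
     - \frac{d}{ds}\frac{\partial\mathcal{L}}{\partial D'}
   = f(s) + 2\lambda(s)D(s) - \frac{\mu}{2} - \lambda'(s)
   \;\Longrightarrow\;
   \lambda'(s) - 2D(s)\lambda(s) = f(s) - \frac{\mu}{2}, \\[4pt]
H(s) &= D'(s)\lambda(s)
        - \Big[f(s)D(s) + \lambda(s)\big(D'(s) + D^2(s)\big) - \frac{\mu}{2}D(s)\Big]
      = \Big(\frac{\mu}{2} - f(s)\Big)D(s) - \lambda(s)D^2(s).
\end{align*}
```

The $D'(s)\lambda(s)$ terms cancel in the Hamiltonian, which is why $H(s)$ depends on $D(s)$ but not on its derivative.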
B. Optimality of Single-Layer Discrete Rate Allocation

When the fading distribution of the side-information channel is continuous, in general, we need to
consider the possibility of a continuous rate allocation function $R(s)$ over the continuum of codeword
layers. In the following, however, we show that a single-layer discrete rate allocation suffices for a wide
class of channel fading distributions. Specifically, we focus on channels with quasiconcave
fading distributions, also referred to as unimodal distributions, and show that single-layer rate allocation is
optimal. (A function $g(x)$ is quasiconcave if its superlevel sets $\{x \mid g(x) \ge \alpha\}$, for all $\alpha$, are convex.)
Most common wireless channel fading distributions, e.g., Rayleigh, Rician, Nakagami, log-normal, are
quasiconcave. Informally, a quasiconcave fading distribution is one whose most probable
channel realizations lie within a single contiguous interval. Under such fading distributions,
it suffices for the encoder to target only a single side-information channel condition.

Proposition 3. In (89)-(92), suppose the pdf $f(s)$ is continuous and quasiconcave; then the rate allocation
$R^*(s) = R_X\delta(s - s_a)$ is optimal for some $s_a \ge 0$.

Proof: In the following, we provide an explicit construction of a single-layer rate allocation scheme,
and show that it satisfies the KKT optimality conditions (91), (92), (95)-(102). Suppose $f(s)$ is
quasiconcave, and we denote its superlevel set by the interval $[s_a, s_b] = \{s \mid f(s) \ge \mu_{(1)}/2\}$ for some nonnegative
constant $\mu_{(1)} \ge 0$. The subscript in parentheses is a mnemonic for single-layer allocation. We concentrate
all rate at $s_a$, i.e., $R_{(1)}(s) = R_X\delta(s - s_a)$, and the corresponding distortion function is

$$D_{(1)}(s) = \begin{cases} D_1(s), & 0 \le s < s_a \\ D_2(s), & s_a \le s \end{cases} \tag{106}$$

$$D_1(s) \triangleq (\sigma_X^{-2} + s)^{-1} \tag{107}$$

$$D_2(s) \triangleq \big(s - s_a + D_1^{-1}(s_a)e^{2R_X}\big)^{-1}. \tag{108}$$

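Next, we construct $\lambda_{(1)}(s)$ to be the solution of the differential equation (103), with the boundary condition
$\lambda_{(1)}(s_a) = 0$:

$$\lambda_{(1)}(s) = w_i^{-1}(s)\int_{s_a}^{s} w_i(t)\big(f(t) - \mu_{(1)}/2\big)\,dt \tag{109}$$

where $i = 1$ for $0 \le s < s_a$, $i = 2$ for $s_a \le s$, and the $w_i(s)$'s are the integrating factors

$$w_1(s) = \exp\int_{s_a}^{s} -2D_1(t)\,dt = \Big(\frac{\sigma_X^{-2} + s_a}{\sigma_X^{-2} + s}\Big)^2 \tag{110}$$

$$w_2(s) = \exp\int_{s_a}^{s} -2D_2(t)\,dt = \Big(\frac{D_1^{-1}(s_a)e^{2R_X}}{s - s_a + D_1^{-1}(s_a)e^{2R_X}}\Big)^2. \tag{111}$$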

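To satisfy the boundary condition (98), we substitute (106), (109) in (105). Note that being a probability
distribution, $\lim_{s\to\infty} f(s) = 0$. Then $\mu_{(1)}/2$ simplifies as follows:

$$\frac{\mu_{(1)}}{2} = \int_{s_a}^{\infty} w(s)f(s)\,ds \tag{112}$$

$$w(s) \triangleq \frac{D_1^{-1}(s_a)e^{2R_X}}{\big(s - s_a + D_1^{-1}(s_a)e^{2R_X}\big)^2}. \tag{113}$$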

Recognizing $f(s_a) = \mu_{(1)}/2$, (112) can be solved numerically. By construction, the KKT conditions (91),
(92), (95)-(101) are satisfied. Finally, we verify the dual feasibility conditions (102). With $f(s), w_i(s), w(s) \ge 0$,
it follows that $\mu_{(1)} \ge 0$. Note that $\lambda_{(1)}(s) \ge 0$ for $s \le s_b$ since $f(s) \le \mu_{(1)}/2$ for $s \le s_a$, and $f(s) \ge \mu_{(1)}/2$
for $s_a \le s \le s_b$. For $s > s_b$, $\lambda_{(1)}(s)$ is decreasing, but we show that it never descends below zero:

$$\lim_{s\to\infty} w_2(s)\lambda_{(1)}(s) = \int_{s_a}^{\infty} w_2(t)f(t)\,dt - \frac{\mu_{(1)}}{2}\int_{s_a}^{\infty} w_2(t)\,dt = 0. \tag{114}$$

Fig. 9. Optimal rate allocation under Rician fading ($\bar{S} = 1$, $\sigma_X^2 = 1$). The single-layer rate allocation is given by $R(s) = R_X\delta(s - s_a)$.

For example, the Rician fading distribution is quasiconcave, and its optimal rate allocation is plotted in
Fig. 9 for different values of $K$ and $R_X$. The single-layer rate allocation is completely specified by the
codeword layer target $s_a$, which is computed by numerically solving (112). Consistent with the results
in Section III-C, we note that $s_a$ increases with $K$, but decreases with $R_X$. The corresponding minimum
expected distortion is given by

$$\mathrm{E}[D]^*_{(1)} = \int_0^{s_a} \frac{f(s)}{\sigma_X^{-2} + s}\,ds + \int_{s_a}^{\infty} \frac{f(s)}{s - s_a + (\sigma_X^{-2} + s_a)e^{2R_X}}\,ds. \tag{115}$$
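The numerical solution for the layer target can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the procedure used for the figures. It swaps the Rician pdf for the Nakagami-$m$ power-gain pdf (a Gamma density, also quasiconcave) so that only the standard library is needed. It solves $f(s_a) = \int_{s_a}^{\infty} w(s)f(s)\,ds$ with $w(s) = c/(s - s_a + c)^2$ and $c = (\sigma_X^{-2} + s_a)e^{2R_X}$ by bisection on $[0, \text{mode}]$. All integrals are truncated at a hypothetical upper limit `s_max`. The function names are invented for this sketch.

```python
import math

def nakagami_power_pdf(s, m, s_bar):
    """Gamma pdf of the power gain under Nakagami-m fading (m > 1 assumed)."""
    if s <= 0.0:
        return 0.0
    return (m / s_bar) ** m * s ** (m - 1.0) * math.exp(-m * s / s_bar) / math.gamma(m)

def integrate(func, a, b, n=4000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def kkt_gap(s_a, rate, sigma2, pdf, s_max):
    """f(s_a) minus the weighted integral of f over [s_a, s_max]."""
    c = (1.0 / sigma2 + s_a) * math.exp(2.0 * rate)
    weighted = integrate(lambda s: c / (s - s_a + c) ** 2 * pdf(s), s_a, s_max)
    return pdf(s_a) - weighted

def solve_layer_target(rate, sigma2, pdf, mode, s_max, iters=50):
    """Bisection for s_a on [0, mode]; the gap is negative at 0 and positive at the mode."""
    lo, hi = 0.0, mode
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kkt_gap(mid, rate, sigma2, pdf, s_max) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def expected_distortion(s_a, rate, sigma2, pdf, s_max):
    """Single-layer expected distortion for a given target s_a, truncated at s_max."""
    c = (1.0 / sigma2 + s_a) * math.exp(2.0 * rate)
    below = integrate(lambda s: pdf(s) / (1.0 / sigma2 + s), 0.0, s_a)
    above = integrate(lambda s: pdf(s) / (s - s_a + c), s_a, s_max)
    return below + above

if __name__ == "__main__":
    m, s_bar, sigma2 = 4.0, 1.0, 1.0           # illustrative parameters
    pdf = lambda s: nakagami_power_pdf(s, m, s_bar)
    mode = (m - 1.0) * s_bar / m               # mode of the Gamma pdf
    for rate in (0.25, 1.0):
        s_a = solve_layer_target(rate, sigma2, pdf, mode, 20.0)
        print(rate, s_a, expected_distortion(s_a, rate, sigma2, pdf, 20.0))
```

Differentiating the single-layer expected distortion with respect to $s_a$ gives a positive multiple of the same gap, so the bisection root is also a stationary point of the expected distortion over single-layer targets.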



Let us consider another example under Rician fading ($K = 32$, $R_X = 0.25$, $\bar{S} = 1$, $\sigma_X^2 = 1$), and we illustrate
its KKT optimality conditions graphically in Fig. 10. In the figure, $[s_a, s_b]$ is the $\mu_{(1)}/2$-superlevel set
of $f(s)$, and the regions between $f(s)$ and $\mu_{(1)}/2$ are shaded and labeled (a) and (b), respectively, for
$s_a \le s \le s_b$ and $s > s_b$. From (114), the dual feasibility condition on $\lambda_{(1)}(s)$ corresponds to the area
of (a), weighted by $w_2(s)$, being equal to the area of (b), weighted by $w_2(s)$. Note that increasing $\mu_{(1)}$
shrinks (a) but enlarges (b), and vice versa when $\mu_{(1)}$ is decreased. In the example in Fig. 10,
the resulting functions, including $\lambda_{(1)}(s)$, are consistent with their finite-dimensional counterparts shown earlier, and their plots
are thus not repeated.

For Rayleigh fading, the pdf is given by

$$f_R(s) = (1/\bar{S})\,e^{-s/\bar{S}}, \quad s \ge 0 \tag{116}$$

where $\bar{S}$ is the average channel power gain. Recognizing that any nonempty superlevel set of $f_R(s)$
begins at $s_a = 0$, we have the following corollary.

Corollary 2. For side-information channels under Rayleigh fading with pdf (116), the optimal rate
allocation is $R_R^*(s) = R_X\delta(s)$, and the corresponding minimum expected distortion is

$$\mathrm{E}[D]^*_R = \int_0^\infty \frac{f_R(s)}{s + \sigma_X^{-2}e^{2R_X}}\,ds = (1/\bar{S})\,e^{C/\bar{S}}\,E_1(C/\bar{S}) \tag{117}$$

where $C = \sigma_X^{-2}e^{2R_X}$, and $E_1(\cdot)$ is the exponential integral

$$E_1(x) = \int_x^\infty \frac{e^{-t}}{t}\,dt. \tag{118}$$

Fig. 10. KKT optimality conditions under Rician fading ($K = 32$, $R_X = 0.25$, $\bar{S} = 1$, $\sigma_X^2 = 1$). The dual feasibility of $\lambda_{(1)}(s)$ corresponds
to the area of (a), weighted by $w_2(s)$, being equal to the area of (b), weighted by $w_2(s)$.

Hence, under Rayleigh fading, the source-coding scheme does not depend on $\bar{S}$, $R_X$, and $\sigma_X^2$. It is
optimal to concentrate the entire encoding rate $R_X$ at the base layer $s = 0$, i.e., the source is encoded as
if the side information was absent.
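The closed form in (117) and the optimality of the base layer can be checked numerically. The sketch below is illustrative only: it evaluates $E_1$ with its standard power series (adequate for moderate arguments), compares the closed form against a midpoint-rule integral of the single-layer expected distortion with $s_a = 0$, and evaluates the same integral for a nonzero target. The function names, truncation point, and grid size are assumptions made for this example.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp1(x, terms=80):
    """E_1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!), for moderate x > 0."""
    total = -EULER_GAMMA - math.log(x)
    term = 1.0
    for k in range(1, terms + 1):
        term *= -x / k        # term now holds (-x)^k / k!
        total -= term / k
    return total

def rayleigh_closed_form(rate, sigma2, s_bar):
    """(1/s_bar) * exp(C/s_bar) * E_1(C/s_bar) with C = exp(2R) / sigma2."""
    c = math.exp(2.0 * rate) / sigma2
    return math.exp(c / s_bar) * exp1(c / s_bar) / s_bar

def single_layer_distortion(s_a, rate, sigma2, s_bar, s_max=60.0, n=60000):
    """Midpoint-rule expected distortion under Rayleigh fading for layer target s_a."""
    c = (1.0 / sigma2 + s_a) * math.exp(2.0 * rate)
    h = s_max / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        # Below the target only the side information helps; above it the layer is decoded.
        d = 1.0 / (1.0 / sigma2 + s) if s < s_a else 1.0 / (s - s_a + c)
        total += math.exp(-s / s_bar) / s_bar * d * h
    return total

if __name__ == "__main__":
    rate, sigma2, s_bar = 0.25, 1.0, 1.0   # illustrative parameters
    print(rayleigh_closed_form(rate, sigma2, s_bar))
    print(single_layer_distortion(0.0, rate, sigma2, s_bar))
    print(single_layer_distortion(0.5, rate, sigma2, s_bar))
```

For these parameters the numerical integral at $s_a = 0$ matches the closed form, and moving the target away from the base layer increases the expected distortion.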

Therefore, when the side-information channel has a continuous and quasiconcave fading distribution,
e.g., Rayleigh, Rician, Nakagami, log-normal, it is sufficient to design the source encoder to target a single
channel condition $s_a$ to minimize the expected distortion. The target $s_a$ can be interpreted as the
certainty-equivalent side-information channel condition that encapsulates the effects of the channel statistics on the
expected distortion: i.e., the source is encoded using a single codeword layer as if the side-information channel
had a fixed channel power gain $s_a$. Interestingly, this is contrary to the optimal resource allocation for
maximizing expected capacity or minimizing expected distortion over a slowly fading channel [2], [13],
[14], where, in general, for fading channels with continuous distributions, a continuum of codeword layers
is necessary.

Note that when the fading distribution $f(s)$ is not quasiconcave, it is possible that the rate allocation
minimizing the expected distortion comprises multiple codeword layers. For instance, if $f(s)$ has support that
consists of two disjoint narrow intervals so as to resemble the two-state fading distribution considered in
Section III-B, then $R_X$ indeed may need to be apportioned between two codeword layers as illustrated
in Fig. 3. Nevertheless, Section III-B suggests that adopting separate codeword layers is worthwhile only
for fading distributions of pronounced disparity: i.e., the more favorable fading state needs to be highly
probable and of sufficiently larger gain. On the other hand, a continuous quasiconcave fading distribution
presents a single mode where it is most probable. Intuitively, such distributions do not represent disparate
fading states, and hence separate codeword layers are not necessary. Under discrete fading states, however,
Fig. 3 shows counterexamples to unimodality of the probability mass function (pmf) being a sufficient
condition for single-layer optimal rate allocation. Additional regularity conditions on the pmf, e.g., the
adjacent fading states not being overly dissimilar, may need to be imposed in order to characterize the
discrete counterpart of the sufficient conditions for single-layer optimality in Proposition 3.

The techniques for source coding under fading side-information channels may also be applied to
improve quantize-and-forward schemes [31] in wireless network transmissions, where the side information
represents the auxiliary signals forwarded by a cooperating user as received via a fading channel. In those
cases where distortion measures other than squared error are considered, however, different
conclusions regarding the optimal number of source-coding layers may result.

VI. Conclusions 

We considered the problem of optimal rate allocation and distortion minimization for Gaussian source
coding under squared error distortion, when a fading side-information channel is present. The encoder
knows the fading channel distribution but not its realization. A layered encoding strategy is used, with each
codeword layer targeting the realization of a given fading state. When the side-information channel has two 
discrete fading states, we derived closed-form expressions for the optimal rate allocation among the fading 
states and the corresponding minimum expected distortion. The optimal rate allocation is conservative: rate 
is allocated to the higher layer only if the better fading state is highly probable. Otherwise the potential 
reduction in distortion, from exploiting the more favorable fading state, is not sufficient to compensate 
for the worsened distortion that results from reducing the encoding rate for the base layer. For the case 
of multiple discrete fading states, the minimum expected distortion was shown to be the solution of a 
convex optimization problem. In particular, we derived an efficient representation for the Heegard-Berger 
rate-distortion function, under which the number of variables and constraints in the optimization problem 
is linear in the number of fading states. 

When the fading distribution of the side-information channel is continuous, we minimized the expected 
distortion by formulating it as an infinite-dimensional convex optimization problem. For quasiconcave fad- 
ing distributions, e.g., Rayleigh, Rician, Nakagami, and log-normal, we showed that a single-layer discrete 
rate allocation is optimal. In particular, under Rayleigh fading, the optimal rate allocation concentrates 
at the base layer: i.e., the source is encoded as if the side information was absent. This is in contrast to 
maximizing expected capacity or minimizing expected distortion under slowly fading channels, where in 
general a continuum of codeword layers is necessary. For practical source coding schemes, the results of 
this paper suggest that the encoder only needs to adopt a single codeword layer that targets a particular 
side-information channel condition, which is interpreted as the certainty-equivalent channel power gain 
that encapsulates the effects of the fading statistics. 